DOCUMENT-IDENTIFIER: US 6865582 B2

TITLE: Systems and methods for knowledge discovery in spatial data 



Abstract Text (1) : 

Systems and methods are provided for knowledge discovery in spatial data, as well
as for optimizing recipes used in spatial environments such as may be found in
precision agriculture. A spatial data analysis and modeling module
is provided which allows users to interactively and flexibly analyze and mine 
spatial data. The spatial data analysis and modeling module applies spatial data 
mining algorithms through a number of steps. The data loading and generation module 
obtains or generates spatial data and allows for basic partitioning. The inspection 
module provides basic statistical analysis. The preprocessing module smooths and
cleans the data and allows for basic manipulation of the data. The partitioning 
module provides for more advanced data partitioning. The prediction module applies 
regression and classification algorithms on the spatial data. The integration 
module enhances prediction methods by combining and integrating models. The 
recommendation module provides the user with site-specific recommendations as to 
how to optimize a recipe for a spatial environment such as a fertilizer recipe for 
an agricultural field. 

Detailed Description Text (73) : 

The recommendation module 222 may provide different types of information. For 
example, the recommendation module 222 could be converted into a fertilizer module, 
meaning that the parameter that is evaluated is how much fertilizer should be 
applied to each point based on the spatial data analysis. Or the recommendation
module could be converted into an irrigation module which would evaluate how much
to irrigate the field at predetermined points. Other examples include a pesticide
module, a herbicide module, a seed-variety spacing module, and the like.
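
The idea of one recommendation module specialized by the parameter being evaluated can be sketched as follows. This is a minimal illustration, not the patented method: the function names, the candidate-rate search, and the toy response model are all assumptions introduced here.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Point = Tuple[float, float]  # (x, y) field coordinates; layout is assumed


def recommend(points: Iterable[Point],
              predict: Callable[[Point, float], float],
              candidate_rates: List[float]) -> Dict[Point, float]:
    """For each point, pick the candidate rate with the highest predicted outcome."""
    return {p: max(candidate_rates, key=lambda r: predict(p, r)) for p in points}


# Hypothetical response model (illustration only): the outcome peaks at a
# point-specific optimum rate that grows with the x coordinate.
def toy_predict(point: Point, rate: float) -> float:
    optimum = 50.0 + 10.0 * point[0]
    return -(rate - optimum) ** 2


plan = recommend([(0.0, 0.0), (2.0, 1.0)], toy_predict, [40.0, 50.0, 60.0, 70.0])
```

Swapping in a different `predict` function (for water, pesticide, or seed spacing) would turn the same routine into a different kind of module, which is the flexibility the paragraph above describes.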

Current US Original Classification (1) : 
707/104.1 
DOCUMENT-IDENTIFIER: US 6836773 B2

TITLE: Enterprise web mining system and method 



Abstract Text (1) : 

An enterprise-wide web data mining system, computer program product, and method of 
operation thereof, that uses Internet based data sources, and which operates in an 
automated and cost effective manner. The enterprise web mining system comprises: a 
database coupled to a plurality of data sources, the database operable to store 
data collected from the data sources; a data mining engine coupled to the web 
server and the database, the data mining engine operable to generate a plurality of 
data mining models using the collected data; a server coupled to a network, the 
server operable to: receive a request for a prediction or recommendation over the 
network, generate a prediction or recommendation using the data mining models, and 
transmit the generated prediction or recommendation. 

Detailed Description Text (51) : 

Data preprocessing engine 903 provides the extraction and transformation 
components, which extract data from web logs and other corporate information 
sources and transform it into a form suitable for data mining model construction. 
There are several main sub-components of data preprocessing engine 903. The mapping 
and selection component reads corporate database tables, such as those from 
corporate data sources 914, and maps specific fields into the account-based mining 
tables. The web data transformation component reads raw log files, and optionally
transaction summaries, from external data sources 916, and converts them into the
transaction-based mining schema (TBMS) used by the present invention. The web data
transformation component also performs semantic analysis and keyword extraction on
the original and converted web data to produce conceptual tables, concept-based
mining schema (CBMS).
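
As a rough illustration of the kind of transformation such a component might perform on raw web log data, the sketch below groups hits into per-visitor sessions. The record layout (IP address, Unix timestamp, path), the use of the IP address as the visitor key, and the 30-minute inactivity timeout are assumptions made for this example and are not taken from the patent.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Hit = Tuple[str, int, str]  # (visitor_ip, unix_timestamp, requested_path)


def reconstruct_sessions(hits: List[Hit], timeout: int = 1800) -> Dict[str, List[List[str]]]:
    """Group each visitor's hits into sessions split by gaps longer than `timeout` seconds."""
    by_visitor: Dict[str, List[Tuple[int, str]]] = defaultdict(list)
    for ip, ts, path in hits:
        by_visitor[ip].append((ts, path))

    sessions: Dict[str, List[List[str]]] = {}
    for ip, visits in by_visitor.items():
        visits.sort()  # order each visitor's hits by timestamp
        paths: List[List[str]] = []
        last_ts = None
        for ts, path in visits:
            if last_ts is None or ts - last_ts > timeout:
                paths.append([])  # start a new session
            paths[-1].append(path)
            last_ts = ts
        sessions[ip] = paths
    return sessions


hits = [
    ("10.0.0.1", 0, "/home"),
    ("10.0.0.2", 10, "/home"),
    ("10.0.0.1", 60, "/products"),
    ("10.0.0.1", 4000, "/home"),
]
sessions = reconstruct_sessions(hits)
```

Each inner list is the path a visitor followed within one session, which is the sort of row a transaction-based table could then store.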

Current US Original Classification (1) : 
707/6 

CLAIMS : 

1. A computer-implemented method of enterprise web mining comprising the steps of: 
collecting data from a plurality of data sources, including proprietary corporate 
data comprising proprietary account or user-based data, external data comprising 
data acquired from sources external to the system, Web data comprising Web traffic 
data, web server application program interface data and Web server log data, and 
Web transaction data comprising data relating to transactions completed over the 
Web; selecting data that is relevant to a desired output from among the collected 
data by mapping between general attributes and particular features, the selected 
data having reduced dimensionality relative to the collected data; pre-processing 
the selected data by removing redundant or irrelevant information from Web server 
log data, by identifying a visitor to a web site from the Web traffic data, 
reconstructing a session from the Web traffic data, by reconstructing a path 
followed by a visitor in a session from the Web server log data, by analyzing a 
path a whole Website from the Web server log data, by converting to filenames from 
the Web server log data to page titles, and by converting IP addresses from the Web 
traffic data to domain names; building a plurality of database tables from the
pre-processed selected data, wherein the acquired data comprises a plurality of
different types of data; integrating the collected data by forming an integrated 
database comprising collected data in a coherent format using generated taxonomies 
to group attributes of the data and using generated profiles of the data; 
generating a plurality of data mining models using the collected data; and
generating a prediction or recommendation using at least one of the plurality of 
generated data mining models, in response to a received request for a 
recommendation or prediction. 

6. A computer program product for performing an enterprise web mining process in an
electronic data processing system, comprising: a computer readable medium; computer 
program instructions, recorded on the computer readable medium, executable by a 
processor, for performing the steps of: collecting data from a plurality of data 
sources, including proprietary corporate data comprising proprietary account or 
user-based data, external data comprising data acquired from sources external to 
the system, Web data comprising Web traffic data, web server application program 
interface data and Web server log data, and Web transaction data comprising data 
relating to transactions completed over the Web; selecting data that is relevant to 
a desired output from among the collected data by mapping between general 
attributes and particular features, the selected data having reduced dimensionality 
relative to the collected data; pre-processing the selected data by removing 
redundant or irrelevant information from Web server log data, by identifying a 
visitor to a web site from the Web traffic data, reconstructing a session from the 
Web traffic data, by reconstructing a path followed by a visitor in a session from 
the Web server log data, by analyzing a path a whole Website from the Web server 
log data, by converting to filenames from the Web server log data to page titles,
and by converting IP addresses from the Web traffic data to domain names; building 
a plurality of database tables from the pre-processed selected data, wherein the 
acquired data comprises a plurality of different types of data; integrating the 
collected data by forming an integrated database comprising collected data in a 
coherent format using generated taxonomies to group attributes of the data and 
using generated profiles of the data; generating a plurality of data mining models 
using the collected data; and generating a prediction or recommendation using at 
least one of the plurality of generated data mining models, in response to a 
received request for a recommendation or prediction. 

11. A system for performing an enterprise web mining process, comprising: a 
processor operable to execute computer program instructions; and a memory operable 
to store computer program instructions executable by the processor, for performing 
the steps of: collecting data from a plurality of data sources, including 
proprietary corporate data comprising proprietary account or user-based data, 
external data comprising data acquired from sources external to the system, Web 
data comprising Web traffic data, web server application program interface data and 
Web server log data, and Web transaction data comprising data relating to
transactions completed over the Web; selecting data that is relevant to a desired 
output from among the collected data by mapping between general attributes and 
particular features, the selected data having reduced dimensionality relative to 
the collected data; pre-processing the selected data by removing redundant or 
irrelevant information from Web server log data, by identifying a visitor to a web 
site from the Web traffic data, reconstructing a session from the Web traffic data, 
by reconstructing a path followed by a visitor in a session from the Web server log 
data, by analyzing a path a whole Website from the Web server log data, by 
converting to filenames from the Web server log data to page titles, and by 
converting IP addresses from the Web traffic data to domain names; building a 
plurality of database tables from the pre-processed selected data, wherein the 
acquired data comprises a plurality of different types of data; integrating the 
collected data by forming an integrated database comprising collected data in a 
coherent format using generated taxonomies to group attributes of the data and 
using generated profiles of the data; generating a plurality of data mining models 
using the collected data; and generating a prediction or recommendation using at 
least one of the plurality of generated data mining models, in response to a 
received request for a recommendation or prediction. 

16. An enterprise web mining system comprising: a database system coupled to a 
plurality of data sources, the database system operable to store data collected 
from the data sources, the data sources including proprietary corporate data 
comprising proprietary account or user-based data, external data comprising data 
acquired from sources external to the system, Web data comprising Web traffic data, 
web server application program interface data and Web server log data, and Web
transaction data comprising data relating to transactions completed over the Web, 
the database further operable to select data that is relevant to a desired output 
from among the collected data by mapping between general attributes and particular 
features, the selected data having reduced dimensionality relative to the collected 
data, the database further operable to pre-process the selected data by removing 
redundant or irrelevant information from Web server log data, by identifying a 
visitor to a web site from the Web traffic data, reconstructing a session from the 
Web traffic data, by reconstructing a path followed by a visitor in a session from 
the Web server log data, by analyzing a path a whole Website from the Web server 
log data, by converting to filenames from the Web server log data to page titles,
and by converting IP addresses from the Web traffic data to domain names, the
database further operable to build a plurality of database tables from the
pre-processed selected data, wherein the acquired data comprises a plurality of
different types of data, and the database further operable to integrate the 
collected data by forming an integrated database comprising collected data in a 
coherent format using generated taxonomies to group attributes of the data and 
using generated profiles of the data; a data mining engine coupled to the database, 
the data mining engine operable to generate a plurality of data mining models using 
the integrated database; a server coupled to a network, the server operable to 
receive a request for a prediction or recommendation over the network, generate a 
prediction or recommendation using at least one of the data mining models, and 
transmit the generated prediction or recommendation. 



□ 3. Document ID: US 6745185 B2

L27: Entry 3 of 8 File: USPT Jun 1, 2004



DOCUMENT-IDENTIFIER: US 6745185 B2

TITLE: System and method for online agency service of data mining and analyzing



Abstract Text (1) :

A system and method for an online agency service of data mining and analyzing is
disclosed. The system and method can automatically fetch and analyze data stored in
a remote source database (10) based on a data analysis request originating from a
client site (3). Initially, the client site (3) sends a data analysis request to
the service provider (2). The service provider (2) converts the data analysis
request into a standard format of query information, and searches the source
database (10). A plurality of data records are searched and written into a local
database (23) contained in the service provider (2). Finally, the service provider
(2) analyzes the data stored in the local database, and generates a search report
which is then sent to the client site (3).

Detailed Description Text (9) :

Referring to FIG. 2, a system for an online agency service of data mining and
analyzing (herein simplified as the agency service system) is shown. The agency
service system comprises a source database 10 which may be linked to a Web site, a
service provider 2, and a client site 3. A local database 23 is installed in the
site of the service provider 2 for storing data extracted from the source database
10. The service provider 2 herein is a server which contains executable software
stored therein. A service procedure of the agency service system may be separated
into the following steps: (1) The client site 3 sends a data analysis request to
the service provider 2. (2) The service provider 2 converts the data analysis 
request into a standard format of query information. (3) The service provider 2 
sends the standard format of query information to the source database 10. (4) A 
searching engine attached to the source database 10 performs a data search and 
obtains a plurality of records of source data meeting the standard format of query 
information. (5) The service provider 2 performs extraction and classification on 
the obtained source data and downloads the extracted data to related columns of the 
local database 23. (6) The service provider 2 performs analysis on the data stored 
in the local database 23 and obtains an analysis report. (7) The service provider 2 
sends the analysis report to the client site 3 and charges the client site 3. 
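
The numbered steps above can be sketched as a small request-handling pipeline. Everything in this example is hypothetical: the in-memory list standing in for source database 10, the keyword-based "standard format", and the per-assignee count used as the analysis are assumptions chosen only to make the flow concrete.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical stand-in for the remote source database 10.
SOURCE_DB: List[Dict[str, str]] = [
    {"title": "Data mining method", "assignee": "Acme"},
    {"title": "Web mining system", "assignee": "Acme"},
    {"title": "Irrigation valve", "assignee": "Beta"},
]


def to_standard_query(request: str) -> List[str]:
    """Step 2: normalize a free-form request into lowercase keywords."""
    return request.lower().split()


def search_source(query: List[str]) -> List[Dict[str, str]]:
    """Steps 3-4: return source records whose title contains every keyword."""
    return [r for r in SOURCE_DB if all(k in r["title"].lower() for k in query)]


def analyze(local_db: List[Dict[str, str]]) -> Dict[str, int]:
    """Step 6: count the stored records per assignee."""
    return dict(Counter(r["assignee"] for r in local_db))


def handle_request(request: str) -> Dict[str, int]:
    query = to_standard_query(request)   # step 2
    local_db = search_source(query)      # steps 3-5 (extraction simplified to a copy)
    return analyze(local_db)             # step 6; step 7 would send this report


report = handle_request("Mining")
```

Billing the client (the second half of step 7) is left out of the sketch.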

Current US Original Classification (1) : 
□ 4. Document ID: US 6718338 B2

L27: Entry 4 of 8 File: USPT Apr 6, 2004 



DOCUMENT-IDENTIFIER: US 6718338 B2 

TITLE: Storing data mining clustering results in a relational database for querying 
and reporting 

Abstract Text (1): 

Storing data mining clustering results in a relational database for querying and 
reporting, including reading, from a hierarchical clustering node, clustering data 
describing a clustering, and recording the clustering data in a relational 
clustering record; reading, from a hierarchical cluster node embodied in the 
hierarchical representation of data mining results, cluster data describing a 
cluster, and recording the cluster data in a relational cluster record; reading, 
from a hierarchical cluster attribute node embodied in the hierarchical 
representation of data mining results, cluster attribute data describing a cluster 
attribute, and recording the cluster attribute data in a relational cluster 
attribute record; reading, from a hierarchical cluster attribute bin node embodied 
in the hierarchical representation of data mining results, cluster attribute bin 
data describing a cluster attribute bin, and recording the cluster attribute bin 
data in a relational cluster attribute bin record. 

Detailed Description Text (24): 

Persons of skill in the art will recognize that reading a single attribute bin in 
the hierarchical format requires traversing all the records in the hierarchy above 
the attribute bin every time the attribute bin record is read. Persons skilled in 
the art will recognize that when the key fields for an attribute bin record are 
known, then that attribute bin record can be accessed directly in a single read 
operation in a relational database. Persons skilled in the art will recognize that
reading a particular attribute bin record from a relational format practically 
never requires traversing a hierarchy as such. Persons skilled in the art will 
recognize that the illustrative example pseudocode set forth just above implements 
a traversal of the entire hierarchical PMML structure in order to fill in all the 
relational records. Persons of skill in the art will recognize that one of the 
principal advantages of the present invention is that by use of its embodiments, it 
is typically necessary to traverse a hierarchical format only once, in order to 
convert it to relational format, and that after that single traversal, typical 
embodiments provide all the speed and ease of access of the relational model for 
analysis, extraction, querying, and reporting upon the results of data mining. 
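
The contrast between hierarchical and keyed access can be illustrated with a short sketch. The nested dictionary below is a hypothetical stand-in for a hierarchical clustering result (it is not PMML), and the composite key of cluster id, attribute name, and bin label is an assumption made for this example.

```python
from typing import Dict, Tuple

# Hypothetical hierarchical result: cluster -> attribute -> bins.
hierarchy = {
    "clusters": [
        {"id": "c1", "attributes": [
            {"name": "age", "bins": [{"label": "0-30", "count": 12},
                                     {"label": "31-60", "count": 7}]},
        ]},
        {"id": "c2", "attributes": [
            {"name": "age", "bins": [{"label": "0-30", "count": 3}]},
        ]},
    ]
}

BinKey = Tuple[str, str, str]  # (cluster id, attribute name, bin label)


def flatten(h: dict) -> Dict[BinKey, int]:
    """Traverse the hierarchy once, emitting one keyed record per attribute bin."""
    rows: Dict[BinKey, int] = {}
    for cluster in h["clusters"]:
        for attr in cluster["attributes"]:
            for b in attr["bins"]:
                rows[(cluster["id"], attr["name"], b["label"])] = b["count"]
    return rows


bin_table = flatten(hierarchy)
# Direct keyed access, with no further traversal of the hierarchy.
count = bin_table[("c1", "age", "31-60")]
```

After the single call to `flatten`, every later lookup is one keyed read, which mirrors the single-traversal advantage the passage describes.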

Current US Original Classification (1) : 
707/102 
□ 5. Document ID: US 6477565 B1

L27: Entry 5 of 8 File: USPT Nov 5, 2002 



DOCUMENT-IDENTIFIER: US 6477565 B1

TITLE: Method and apparatus for restructuring of personalized data for transmission 
from a data network to connected and portable network appliances 

Abstract Text (1) : 

A system for retrieving and disseminating information records from Internet sources 
includes a client device and an intermediary server system, including software, 
between the client device and the Internet. The system collects a record specific 
to a client from an individual one of said Internet sources in a first form in 
which the record is recorded at the Internet source, transforms the record from the 
first form to a second form specific to an application other than an Internet 
browser application, the application executable by the client device, and transmits 
the transformed record to the client device for display in the application other 
than an Internet browser application executable by the client device. In some cases 
the client device connects by a data link that is not an Internet-compatible link.
Data mining on the Internet specific to clients and client devices is taught, with 
aggregation services and synchronization for keeping a client up-to-date 
efficiently for changing data content. 

Detailed Description Text (22) : 

The method and apparatus of the present invention provides a unique capability of 
restructuring data in an intelligent way. That is, instead of simply converting one 
format of data into another, a first data set is analyzed and understood so that an 
alternate data set in a format specific to applications executable on a receiving 
device may be created that reflects the desired content and function of the first 
data set. More detail about how this is accomplished is provided below. 

Current US Original Classification (1) : 
709/217 

Current US Cross Reference Classification (1) : 
709/246 

Current US Cross Reference Classification (2) : 
□ 6. Document ID: US 6477538 B2

L27: Entry 6 of 8 File: USPT Nov 5, 2002 



DOCUMENT-IDENTIFIER: US 6477538 B2 

TITLE: Data display apparatus and method for displaying data mining results as 
multi-dimensional data 



Abstract Text (1) : 

A data display apparatus and method for displaying the result of a data mining 
process as multi-dimensional data. In one embodiment, the multi-dimensional data is 
displayed on a parallel coordinate axis. An engine unit executes the data mining 
process on displayed multi-dimensional data according to an instruction from a 
visual data mining tool and transfers the result to the visual data mining tool. 
The user interface unit of the visual data mining tool generates an axis 
corresponding to the result of the data mining process, adds the axis to the 
parallel coordinate axis and displays the result of the data mining process on the 
added axis. 

Brief Summary Text (15) : 

Another form of a data display apparatus of the present invention comprises the 
following units: an input converting unit which receives the data mining analysis 
result on multi-dimensional data, and incorporates the analysis result into the 
multi-dimensional data to be displayed; and a display controlling unit which 
displays the analysis result on the display apparatus based on the output of the 
converting unit in the predetermined format of a graph. 
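
One way to picture the input converting step is as appending the analysis result to each record as one more axis. The sketch below is only an assumed data-level illustration (no drawing is performed); the function name, the record layout, and the cluster ids are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Record = Dict[str, float]


def add_result_axis(axes: Sequence[str], records: Sequence[Record],
                    axis_name: str, results: Sequence[float]) -> Tuple[List[str], List[Record]]:
    """Return new axes and records with the analysis result appended as one more axis."""
    if len(records) != len(results):
        raise ValueError("one result value is required per record")
    new_axes = list(axes) + [axis_name]
    new_records = [{**rec, axis_name: value} for rec, value in zip(records, results)]
    return new_axes, new_records


axes = ["age", "income"]
records = [{"age": 25.0, "income": 40.0}, {"age": 52.0, "income": 90.0}]
cluster_ids = [0.0, 1.0]  # hypothetical output of a clustering run
new_axes, new_records = add_result_axis(axes, records, "cluster", cluster_ids)
```

A plotting layer for parallel coordinates could then treat `"cluster"` like any other axis, so the mining result appears in the same graphical format as the original data.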

Drawing Description Text (49) : 

FIGS. 48A and 48B are diagrams showing the process of converting the result of a 
decision tree analysis to range information on a graph. 

Current US Original Classification (1) : 
707/102 

Current US Cross Reference Classification (1) : 
707/5 

CLAIMS : 

11. A data display apparatus for displaying multi-dimensional data on a graph in a
predetermined graphical format having a coordinate axis system, comprising: input 
converting means for receiving an analysis result from automatically executing a 
data mining process on the multi-dimensional data and incorporating the analysis 
result into the multi-dimensional data to be displayed; and display controlling 
means for displaying the analysis result on the display apparatus on the same graph 
displaying the multi-dimensional data, based on the output of said input converting 
means, wherein the graphical format used to display the multi-dimensional data is 
the same as the graphical format used to display the data mining result. 
□ 7. Document ID: US 6473741 B1

L27: Entry 7 of 8 File: USPT Oct 29, 2002



DOCUMENT-IDENTIFIER: US 6473741 B1

TITLE: Method and system for aggregation and exchange of electronic tax information

Abstract Text (1):

A process that arranges information warehoused at individual accounting and tax 
preparation firms at a central location for the purpose of marketing information. 
Data contained at these firms have qualitative and quantitative characteristics 
that are different from data archived at the Internal Revenue Service or other tax 
authorities. This fact makes the data valuable as data in two ways. First, the data
can be exchanged to provide new revenue streams. Second, these data, if grouped
into data warehouses of other firms, have value as pure data, not just customer
lists. These data may be sold or rented creating additional revenue streams for
their originators. The purchasers of this bulk data are interested in using this
data in the field of data mining. Data mining is a technique of analyzing vast
amounts of information to uncover relationships to predict events and has wide 
application in many areas of the economy. 

Detailed Description Text (36) : 

First, a service bureau 20 can provide no cost or very low cost off site archival 
of data. Backing up firm 10 data is a critical function that is frequently 
overlooked by smaller accounting and tax preparation firms. Secondly, a service 
bureau 20 can provide no cost or very low cost transmission of electronically filed 
income tax returns. Currently, most firms 10 pay a user fee to their software 
vendor for this service. Then, data is stored in detail with associated identifying 
characteristics of the taxpayers such as name, social security numbers, and 
addresses. These data are to be stored on a separate system 30 that protects the 
confidentiality of each taxpayer and may only be released with proper authorization 
procedures and controls. These data are also converted to an electronic format 
suitable for retrieval by users requesting information such as a mortgage lender.
The format will enable mortgage lenders to directly download the complete tax
return into their analysis software and/or credit scoring software. 

Current US Cross Reference Classification (3) : 
707/3 
Abstract Text (1) :

A data display apparatus and method for displaying the result of a data mining 
process as multi-dimensional data. In one embodiment, the multi-dimensional data is 
displayed on a parallel coordinate axis. An engine unit executes the data mining 
process on displayed multi-dimensional data according to an instruction from a 
visual data mining tool and transfers the result to the visual data mining tool. 
The user interface unit of the visual data mining tool generates an axis 
corresponding to the result of the data mining process, adds the axis to the 
parallel coordinate axis and displays the result of the data mining process on the 
added axis. 



Brief Summary Text (15): 

Another form of a data display apparatus of the present invention comprises the 
following units: an input converting unit which receives the data mining analysis 
result on multi-dimensional data, and incorporates the analysis result into the 
multi-dimensional data to be displayed; and a display controlling unit which 
displays the analysis result on the display apparatus based on the output of the 
converting unit in the predetermined format of a graph. 

Drawing Description Text (49) : 

FIGS. 48A and 48B are diagrams showing the process of converting the result of a
decision tree analysis to range information on a graph. 



Current US Original Classification (1) : 
707/102 



Current US Cross Reference Classification (1) :
707/5 